Pieter Abbeel Wins 2021 ACM Prize in Computing
Abbeel is considered the leader of his generation in reinforcement learning (RL), especially as it pertains to robotics. His work on trust region policy optimization provided the first reliable RL procedure for continuous control, as showcased on simulated robotic environments. Building on that work, Abbeel has made several other pioneering contributions to Deep RL for robotics. These contributions include generalized advantage estimation, which enabled the first 3D robot locomotion learning; soft actor-critic, which is one of the most popular Deep RL algorithms to date; domain randomization, which shows how learning across appropriately randomized simulators can generalize surprisingly well to the real world; and hindsight experience replay, which has been instrumental for Deep RL in sparse-reward and goal-oriented environments.